A new notion of probabilistic team inductive inference is introduced and
compared with both probabilistic inference and team inference. In many cases, but 
not all, probabilism can be traded for pluralism, and vice versa. Necessary and 
sufficient conditions are given describing when a team of deterministic or 
probabilistic learning machines can be coalesced into a single learning machine. A 
subtle difference between probabilism and pluralism is revealed. © 1988 Academic Press, Inc.


I. INTRODUCTION 


Inductive inference has been studied by computer scientists as a general 
theory of learning by example (Angluin, 1983) with potential applications 
to artificial intelligence (Angluin, 1987). As in previous studies, we view the 
inference process as a limiting one where some algorithmic device (called 
an inductive inference machine, or IIM) continually inputs more and more 
ordered pairs from the graph of some function and while doing so, outputs 
more and more programs, each conjectured to compute the function which 
is being used as input. In (Blum, 1975) it is shown that changing the order 
of the input does not affect what can and cannot be inferred by the class of 
IIMs. 

Several notions of what it means for an IIM to successfully infer a 
function are defined and compared in (Case, 1983). A particular notion of
success is called a criterion of success. An inference is always performed
with respect to some criterion. In the technical sections of this paper, the 
particular criteria being considered will be explicitly referenced. Inference 
without reference to a specific criterion is synonymous with inference with 
respect to some arbitrary criterion. Unless specified otherwise, the 
unnamed criterion will not vary within a paragraph. 

Pluralism, or team inference, was introduced in (Smith, 1983). A team
successfully infers a set $S$ of recursive functions if for each $f \in S$, some IIM in
the team successfully infers $f$. Different members of the team may succeed
on different members of $S$. Suppose that some team of $n$ IIMs can infer a set
$S$. Then all the functions in $S$ can be inferred with probability $1/n$ by a
single IIM that guesses one of the $n$ machines in the team to emulate. The
converse also holds (Pitt, 1984), in that if an IIM infers a set $S$ with
probability $p$, for $1/(n+1) < p \le 1/n$, then $S$ can be inferred by a team of
size $n$.

The general nature of the approach to inductive inference used in this 
paper renders the results obtained applicable to several problem domains 
including linguistics (Osherson, 1986a). A problem concerning team 
inference has recently attracted the attention of the linguists (Osherson, 
1986b). The previously mentioned study of pluralism concerned teams, 
where 1 out of n IIMs were correct. In (Osherson, 1986b) the following 
question was raised. How many machines, out of n, must be correct in 
order to be able to uniformly and effectively aggregate the n machines into 
a single deterministic machine that succeeds everywhere that the original 
team did? Using the results from (Pitt, 1984; Smith, 1982) we show that 
the answer is that the aggregation is possible only when more than half of 
the n machines in the team succeed. 

Several interesting questions are suggested by the notion of aggregation. 
Below we formally define an ostensibly new notion of team inference 
where, for success, we demand that at least m out of the n machines suc- 
ceed. The power of each such team is equated with some 1 out of n’ team. 
Consequently, by the results from (Pitt, 1984), every m out of n collection 
of IIMs is equivalent in power to some single probabilistic machine. Hence, 
it would appear that one can always trade probabilism for pluralism and 
vice versa. However, this turns out to be true only when one converts all of 
the pluralism into a probabilistic machine or when one takes a 
probabilistic inference machine and converts it into a completely deter- 
ministic team. This conclusion is reached by considering teams of 
probabilistic machines. We will show, for example, that the power of a
single inference machine that succeeds with probability $1/5$ ($= 3/5 \cdot 1/3$) is
equivalent in power to a team of 5 IIMs, where at least 3 always succeed
with probability $1/3$, but is different from the power of a team of 7 IIMs,
where at least 2 always succeed with probability $7/10$. We characterize the
power of teams of probabilistic machines with respect to teams of deterministic
machines. Our results indicate a subtle difference between
pluralism and probabilism.


II. NOTATION, DEFINITIONS, AND PRIOR RESULTS 


This paper is concerned with learning programs that compute recursive
functions. Members of $N$, the natural numbers, will serve as program
names. The positive natural numbers will be denoted by $N^+$.
$\varphi_0, \varphi_1, \ldots$ is an acceptable programming system (Machtey, 1978)
containing all and only the partial recursive functions of a single argument.
The acceptability means that certain natural properties hold for the chosen
enumeration of the partial recursive functions. Program $i$ computes the
function $\varphi_i$. $f$ and $g$ will be used to denote recursive functions for
which no program for computing them is yet known. $\subseteq$ denotes subset and
$\subset$ denotes proper subset.
Suppose an IIM $M$ is given the graph of $f$ as input. We may suppose
without loss of generality that $f$ is given in its natural order
($f(0), f(1), \ldots$) to $M$ (Blum, 1975). $M$ will output a (possibly infinite)
sequence of programs $p_0, p_1, \ldots$, each of which may or may not compute $f$.
$M$ is said to converge on input from $f$ (written: $M(f){\downarrow}$) if either
the sequence $p_0, p_1, \ldots$ is finite or there is an $n$ such that for all
$n' > n$, $p_{n'} = p_n$. $M(f){\downarrow} = p_n$ means that either the sequence of
output programs is finite with length $n + 1$ or all but the first $n$ programs
in the sequence are precisely $p_n$.

Gold (1967) introduced a criterion of successful inference called “identification
in the limit.” This notion will be called EX identification. An IIM
$M$ EX-infers $f$ (written: $f \in EX(M)$) iff $M(f){\downarrow} = p$ and $\varphi_p = f$.
Each IIM will EX-infer some nonrecursive set of recursive functions. EX denotes
the class of such sets, i.e., $EX = \{S \mid (\exists M)[S \subseteq EX(M)]\}$.
EX stands for “explain,” a term consistent with the philosophical motivations
for the study of inductive inference; see (Case, 1983).

For EX inference, the machine must produce a program that is correct
on all inputs. This makes the inference more difficult, or perhaps, as
suggested in (Valiant, 1984), impractical. A partial recursive function $\psi$ is
an $n$-variant of a recursive function $f$ (written: $\psi =^n f$) if the cardinality of
$\{x \mid \psi(x){\uparrow}\} \cup \{x \mid \psi(x){\downarrow} \ne f(x)\}$ is at most $n$.
$\psi$ is a finite variant of $f$ (written: $\psi =^* f$) if
$\{x \mid \psi(x){\uparrow}\} \cup \{x \mid \psi(x){\downarrow} \ne f(x)\}$ is finite.
For any $a \in N \cup \{*\}$, an IIM $M$ $EX^a$-infers $f$ (written: $f \in EX^a(M)$) iff
$M(f){\downarrow} = p$ and $\varphi_p =^a f$. Similarly, for any $a \in N \cup \{*\}$,
$EX^a = \{S \mid (\exists M)[S \subseteq EX^a(M)]\}$. Note that $EX^0 = EX$.
$EX^*$ inference was introduced in (Blum, 1975) and $EX^a$ inference, for
$a \ne *$, was introduced in (Case, 1983). In the inequalities used
to state the hypothesis of some of the theorems below, $*$ is considered to be
greater than or equal to any member of $N$. The following theorem
establishes an anomaly hierarchy for EX.


THEOREM 1 (Case, 1983). $EX^0 \subset EX^1 \subset \cdots \subset EX^*$.


Given an IIM $M$ and a function $f$, it is not possible to determine how
much of $f$ $M$ must see before it converges, if in fact $M(f){\downarrow}$. In practice
then, one would not wait for an IIM to converge. The most recent conjecture
of the IIM would be used as if it were the final one. These considerations,
expressed in (Feldman, 1972), led Barzdin (1974) to discuss a
notion called behaviorally correct inference. Suppose $M$ on input from the
graph of $f$ outputs the conjectured programs $p_0, p_1, \ldots$. Then $M$ BC-infers $f$
(written: $f \in BC(M)$) if for all but finitely many $n$, $\varphi_{p_n} = f$.
$BC = \{S \mid (\exists M)[S \subseteq BC(M)]\}$.


THEOREM 2 (Case, 1983). $EX^* \subset BC$.


Behaviorally correct inference with anomalies has been investigated (Case,
1983). There is also an anomaly hierarchy for BC. The results of this paper
apply only to BC inference and the inference criteria in the anomaly
hierarchy for EX. In what follows, let $\mathcal{I}$ denote an arbitrary inference
criterion in $\{EX^0, EX^1, \ldots, EX^*, BC\}$.

This paper is primarily concerned with collections of IIMs. For $m$ and $n$
in $N^+$, a set of functions $S$ is in the class $[m,n]\mathcal{I}$ iff $m \le n$ and there exist
IIMs $M_1, M_2, \ldots, M_n$ such that for each function $f \in S$ there are
$1 \le i_1 < i_2 < \cdots < i_m \le n$ such that for all $1 \le j \le m$, $f \in \mathcal{I}(M_{i_j})$. In other
words, $m$ out of the $n$ inference machines $\mathcal{I}$-infer each $f$ in $S$. Suppose that
$S \in [m,n]\mathcal{I}$ as witnessed by $M_1, M_2, \ldots, M_n$. Different functions in $S$ may
be inferred by different size-$m$ subsets of $\{M_1, M_2, \ldots, M_n\}$. Furthermore,
the machines $\{M_1, M_2, \ldots, M_n\}$ may all converge to different programs for
the function being inferred. The definition of $[1,n]\mathcal{I}$ appeared in (Smith,
1982). As an immediate consequence of this definition we have that
$[1,1]\mathcal{I} = \mathcal{I}$. As an almost immediate consequence of the definition we
have the following.


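PROPOSITION 3. For any $k \in N^+$, for all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[k,k]\mathcal{I} = \mathcal{I}$.


Proof. Choose $k \in N^+$ and $\mathcal{I}$ an inference criterion. Suppose $S \in \mathcal{I}$ as
witnessed by $M$, an IIM. For $1 \le i \le k$, let $M_i = M$. Then $M_1, M_2, \ldots, M_k$
witness that $S \in [k,k]\mathcal{I}$. On the other hand, suppose $S \in [k,k]\mathcal{I}$ as
witnessed by $M_1, M_2, \ldots, M_k$. Then, by definition,
$S \subseteq \mathcal{I}(M_1) \cap \mathcal{I}(M_2) \cap \cdots \cap \mathcal{I}(M_k)$.
Then $S \subseteq \mathcal{I}(M_1)$. Hence, $S \in \mathcal{I}$. ∎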


The following results concerning team inference are used below. The first
result establishes a hierarchy based on pluralism for each criterion considered
in this paper. The second result shows how pluralism trades off
against the EX type inference criteria.


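THEOREM 4 (Smith, 1982). For all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$, for all
$n \in N^+$, $[1,n]\mathcal{I} \subset [1,n+1]\mathcal{I}$.


THEOREM 5 (Smith, 1982). For all $m, n \in N^+$, for all $a, b \in N \cup \{*\}$,
$[1,m]EX^a \subseteq [1,n]EX^b$ iff ($m \le n$ and $a \le b$) or ($a, b, m, n \in N$ and
$n \ge m \cdot (1 + \lfloor a/(b+1) \rfloor)$).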


The “($m \le n$ and $a \le b$)” clause above is necessary only for the cases
when $a = *$ and/or $b = *$. It is interesting to note, although not crucial to
the discussions below, that precisely the same trade-off formula describes
the relationships between the number of machines and the number of
anomalies for the BC anomaly hierarchy (Daley, 1983). The EX and BC
team hierarchies are related by the following results.


THEOREM 6 (Smith, 1982). For all $m, n \in N^+$, for all $a \in N \cup \{*\}$,
$[1,m]EX^a \subseteq [1,n]BC$ iff $m \le n$.


THEOREM 7 (Smith, 1982). For all $n \in N^+$, $BC - [1,n]EX^* \ne \emptyset$.


As promised in the title, this paper addresses the issue of probabilistic 
inductive inference. Inference machines that have a fair coin to toss were 
investigated and characterized in (Pitt, 1984). Suppose that M is an IIM 
that has a fair coin to toss that is trying to learn a program for the function 
f. If a particular order for the enumeration of the graph of f is fixed, then 
the outcome of M applied to f depends only on the results of the coin 
tosses. Using the standard Borel measure on the (infinite) sequences of coin 
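tosses possible, the set of sequences for which $M(f){\downarrow}$ to a program for $f$ is
measurable. In this sense, for $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$, $M$ an IIM, and
$0 < p \le 1$, we say that $f \in \mathcal{I}\langle p\rangle(M)$ if $M$ $\mathcal{I}$-infers $f$ with probability at
least $p$. The classes $\mathcal{I}\langle p\rangle$ are defined analogously. For $0 < p \le 1$,
$\mathcal{I}\langle p\rangle = \{S \mid (\exists M)[S \subseteq \mathcal{I}\langle p\rangle(M)]\}$. Note that, if $p \le p'$, then
$\mathcal{I}\langle p'\rangle \subseteq \mathcal{I}\langle p\rangle$.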
Most of our results below are based upon the following theorem concern- 
ing probabilistic inference. 


THEOREM 8 (Pitt, 1984). For all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$, for all
$n \in N^+$, if $1/(n+1) < p \le 1/n$ then $\mathcal{I}\langle p\rangle = [1,n]\mathcal{I}$.


The discrete nature of the probabilistic inference hierarchies has been
noted in other studies involving inference and probability (Freivalds,
1979a, 1979b; Podnieks, 1975, 1977; Wiehagen, 1984). The notion of interval
is germane to the study of probabilistic inference. Consequently,
we define an interval function IN such that for all $0 < p \le 1$,
$IN(p) = 1/\lfloor 1/p \rfloor = 1/n$, where $n$ is the unique natural number such that
$1/(n+1) < p \le 1/n$. Note that $IN(IN(p)) = IN(p)$.
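The interval function is easy to compute exactly. The following is a minimal sketch, assuming rational arithmetic via Python's `fractions.Fraction`; the function name `interval` is illustrative and does not come from the paper.

```python
from fractions import Fraction
import math


def interval(p):
    """IN(p) = 1 / floor(1/p), defined for 0 < p <= 1."""
    p = Fraction(p)
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    n = math.floor(1 / p)  # the unique n with 1/(n+1) < p <= 1/n
    return Fraction(1, n)


print(interval(Fraction(2, 7)))            # 1/3, since 1/4 < 2/7 <= 1/3
print(interval(interval(Fraction(2, 7))))  # 1/3 again: IN(IN(p)) = IN(p)
```

Exact fractions are used so that boundary values such as $p = 1/n$ land in the correct interval, which floating-point division does not guarantee.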


COROLLARY 9 (Pitt, 1984). For all $0 < p \le 1$, $\mathcal{I}\langle p\rangle = [1, 1/IN(p)]\mathcal{I}$.


Proof. Immediate from the definition of IN and from Theorem 8. 


The following observation will be useful in Section IV. 


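PROPOSITION 10. For all $0 < p, p' \le 1$, for all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle p'\rangle$ iff $IN(p) \ge IN(p')$.


Proof. Suppose $\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle p'\rangle$. Then, by Corollary 9,

$$[1, 1/IN(p)]\mathcal{I} = \mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle p'\rangle = [1, 1/IN(p')]\mathcal{I}.$$

It follows easily from Theorem 4 that $1/IN(p) \le 1/IN(p')$, hence
$IN(p) \ge IN(p')$.

Conversely, suppose that $IN(p) \ge IN(p')$. Then $1/IN(p) \le 1/IN(p')$, and
by Theorem 4 and Corollary 9

$$\mathcal{I}\langle p\rangle = [1, 1/IN(p)]\mathcal{I} \subseteq [1, 1/IN(p')]\mathcal{I} = \mathcal{I}\langle p'\rangle.$$ ∎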


A straightforward and natural combination of the notions of pluralistic
and probabilistic inference results in the definition of some ostensibly new
classes of inferable functions that are studied in this paper. For $m$ and $n$ in
$N^+$ and $0 < p \le 1$, a set of functions $S$ is in the class $[m,n]\mathcal{I}\langle p\rangle$ iff $m \le n$
and there exist probabilistic IIMs $M_1, M_2, \ldots, M_n$ such that for each
function $f \in S$ there are $1 \le i_1 < i_2 < \cdots < i_m \le n$ such that for all $1 \le j \le m$,
$f \in \mathcal{I}\langle p\rangle(M_{i_j})$. Essentially, we have defined teams of size $n$ that succeed on
a given function $f$ if and only if at least $m$ of the machines in the team
succeed on $f$ with probability at least $p$. The mathematics of this paper is
concerned with relating the classes $[m,n]\mathcal{I}\langle p\rangle$, for various $m$, $n$, $\mathcal{I}$,
and $p$, to one another and to other previously studied inference classes. The
basic question we seek to answer is: Can one always trade pluralism for
probability (and vice versa) as Theorem 8 would suggest? We answer this
question negatively.

The nonunion theorem of (Blum, 1975) indicates that pairs of inference
machines cannot be uniformly combined into a single machine capable of
inferring any function that could be inferred by either of the original pair of
machines. Given some team of inference machines, it is sometimes possible
to combine, or aggregate, them into a single IIM that is as powerful as the
original team. The amalgamation technique of (Case, 1983) is one such
method of successfully aggregating IIMs. This technique will succeed only
when all the machines in the team to be aggregated are known to output
only finitely many programs.

Suppose IIM $M$ is simulating $M_1, M_2, \ldots, M_n$ on input from the function
$f$. For $1 \le i \le n$, let $p_i$ denote the most recent output by $M_i$ on the finite
portion of $f$ received so far. $M$ then runs program $p_i$ on all inputs $x$ for
which the value $f(x)$ is known. If $M$ finds that $\varphi_{p_i}(x) \ne f(x)$, then $M$
ignores the output of $M_i$ until $M_i$ outputs a program different from $p_i$. $M$
then outputs a program $q$ that on input $x$ dovetails the computations of all
the programs $p_1, p_2, \ldots, p_n$ that it is not ignoring until one of them converges.
Program $q$ on input $x$ then outputs the result of the convergent
program. Any program that produces a wrong value is eliminated from the
amalgamation. As long as one program in the amalgamation is correct, the
amalgamated program will be correct. Every time one of the machines
$M_1, M_2, \ldots, M_n$ outputs a new program, the amalgamation process starts
again. Essentially the same technique was also used in (Valiant, 1984). It
turns out, as our first result indicates, that any collection of deterministic
or probabilistic IIMs can be aggregated into a single probabilistic IIM.
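One round of the amalgamation step described above can be sketched as follows. This is an illustration under simplifying assumptions, not the construction of (Case, 1983) itself: each program is modelled as a hypothetical Python callable `prog(x, steps)` that returns its output if it halts within `steps` steps and `None` otherwise, and the names `amalgamate`, `budget`, and the toy programs are invented for the example.

```python
from itertools import count


def amalgamate(programs, known, budget=100):
    """Build the program q from the current conjectures.

    programs: callables prog(x, steps) -> value, or None if not yet halted
    known:    dict mapping each seen input x to the observed value f(x)
    budget:   step bound used when testing conjectures on the seen data
    """
    def refuted(prog):
        # A conjecture is ignored once it halts with a wrong value on seen data.
        for x, y in known.items():
            out = prog(x, budget)
            if out is not None and out != y:
                return True
        return False

    survivors = [p for p in programs if not refuted(p)]

    def q(x):
        if not survivors:
            raise ValueError("every conjecture was refuted")
        for steps in count(1):       # dovetail: 1 step each, then 2 steps, ...
            for p in survivors:
                out = p(x, steps)
                if out is not None:  # first conjecture to converge wins
                    return out

    return q


# Toy conjectures for f(x) = x * x (all hypothetical).
def slow_square(x, steps):
    return x * x if steps > x else None   # halts only after x steps

def wrong(x, steps):
    return x + 1                          # halts at once, usually wrong

def diverges(x, steps):
    return None                           # never halts

q = amalgamate([wrong, diverges, slow_square], known={0: 0, 1: 1, 2: 4})
print(q(3))  # 9: `wrong` was refuted on the data, `diverges` never answers
```

The divergent conjecture cannot be refuted by finite data, which is why `q` must dovetail rather than run the surviving programs one after another.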


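PROPOSITION 11. For all $m \le n \in N^+$, $0 < p \le 1$, and $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[m,n]\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle (m/n) \cdot p\rangle$, and $[m,n]\mathcal{I} \subseteq \mathcal{I}\langle m/n\rangle$.


Proof. We show that $[m,n]\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle (m/n) \cdot p\rangle$, and the second
result follows analogously. Suppose $m \le n$, $p \le 1$, $\mathcal{I}$ is an inference
criterion, and $M_1, M_2, \ldots, M_n$ are (probabilistic) IIMs witnessing that
$S \in [m,n]\mathcal{I}\langle p\rangle$ for some set of functions $S$. Let $M$ be an IIM that
behaves as follows. Once started, before receiving any input, $M$ chooses
uniformly at random an $i$ such that $1 \le i \le n$. $M$ then proceeds to simulate $M_i$.
For each $f \in S$, with probability at least $m/n$ the chosen $M_i$ is one of the $m$
machines that infer $f$ with probability at least $p$.
Hence $S \subseteq \mathcal{I}\langle (m/n) \cdot p\rangle(M)$, so $S \in \mathcal{I}\langle (m/n) \cdot p\rangle$, and therefore
$[m,n]\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle (m/n) \cdot p\rangle$. ∎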
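The bound $(m/n) \cdot p$ in Proposition 11 is just the success probability of a uniformly random choice of team member. A small sketch, assuming each machine's success probability on a fixed function $f$ is given as a number (the function name and the sample team are hypothetical):

```python
from fractions import Fraction


def amalgamation_success(success_probs):
    """Success probability on f of a machine that picks one team member
    uniformly at random and simulates it, given each member's own
    success probability on f."""
    n = len(success_probs)
    return sum((Fraction(p) for p in success_probs), Fraction(0)) / n


# Hypothetical team of n = 5 in which m = 3 members succeed on f with p = 1/3
# and the other two never succeed.
team = [Fraction(1, 3)] * 3 + [Fraction(0)] * 2
print(amalgamation_success(team))  # 1/5, i.e. (m/n) * p
```

If the remaining machines succeed with positive probability the value only increases, so $(m/n) \cdot p$ is a lower bound.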

We will call the machine $M$ constructed above the probabilistic
amalgamation of the machines $M_1, M_2, \ldots, M_n$. Proposition 11, with the
results reviewed in the previous section, can be used to equate every “m out
of n” team of IIMs with some “1 out of n” collection:


THEOREM 12. For all $m \le n \in N^+$, for all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[m,n]\mathcal{I} = [1, \lfloor n/m \rfloor]\mathcal{I} = \mathcal{I}\langle IN(m/n)\rangle$.


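Proof. The second equality follows from the definition of IN and
Corollary 9. We prove the first equality. Suppose $m \le n$ and $\mathcal{I}$ is an
inference criterion. By Proposition 11, $[m,n]\mathcal{I} \subseteq \mathcal{I}\langle m/n\rangle$. By simple
calculation we have

$$\frac{1}{\lfloor n/m \rfloor + 1} < \frac{m}{n} \le \frac{1}{\lfloor n/m \rfloor}.$$

Hence, by Theorem 8, $\mathcal{I}\langle m/n\rangle = [1, \lfloor n/m \rfloor]\mathcal{I}$. Consequently,
$[m,n]\mathcal{I} \subseteq [1, \lfloor n/m \rfloor]\mathcal{I}$.

Again by Theorem 8, $\mathcal{I}\langle 1/\lfloor n/m \rfloor\rangle = [1, \lfloor n/m \rfloor]\mathcal{I}$.
Given any collection of $k$ IIMs and any constant $c$, by making $c$ copies of
each of the $k$ machines, we have $[1,k]\mathcal{I} \subseteq [c, c \cdot k]\mathcal{I}$. Consequently,
$[1, \lfloor n/m \rfloor]\mathcal{I} \subseteq [m, m \cdot \lfloor n/m \rfloor]\mathcal{I}$. If a team of IIMs is enlarged by adding
some more inference machines to the collection, then possibly some more
functions become inferable by the team. In any event, the enlarged team
will infer all the functions that the original team did. Since $m \cdot \lfloor n/m \rfloor \le n$,
$[m, m \cdot \lfloor n/m \rfloor]\mathcal{I} \subseteq [m,n]\mathcal{I}$. Hence,

$$[1, \lfloor n/m \rfloor]\mathcal{I} \subseteq [m,n]\mathcal{I}.$$ ∎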
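The "simple calculation" used in this proof, $1/(\lfloor n/m \rfloor + 1) < m/n \le 1/\lfloor n/m \rfloor$, can be checked mechanically for small teams. The sketch below is only a sanity check over a finite range, not a proof; the helper name `sandwich_holds` is made up for the example.

```python
from fractions import Fraction


def sandwich_holds(m, n):
    """Check 1/(floor(n/m) + 1) < m/n <= 1/floor(n/m) for 1 <= m <= n."""
    q = n // m  # floor(n/m), at least 1 because m <= n
    return Fraction(1, q + 1) < Fraction(m, n) <= Fraction(1, q)


pairs = [(m, n) for n in range(1, 80) for m in range(1, n + 1)]
print(all(sandwich_holds(m, n) for m, n in pairs))  # True
```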


The original motivation (Osherson, 1986b) behind the study of m correct 
inferences out of n machines was to be able to answer the question of when 
a team of IIMs could be combined into a single machine. The above 
theorem enables us to answer this question completely. 


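COROLLARY 13. For all $m \le n \in N^+$, for all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[m,n]\mathcal{I} = \mathcal{I}$ iff $m/n > 1/2$.


Proof. Suppose $m \le n$ and $\mathcal{I}$ is an inference criterion. Suppose further
that $m/n > 1/2$. Then $2 > n/m$ so $\lfloor n/m \rfloor = 1$. By Theorem 12,
$[m,n]\mathcal{I} = [1, \lfloor n/m \rfloor]\mathcal{I} = [1,1]\mathcal{I} = \mathcal{I}$.

Suppose $m/n \le 1/2$. Then $2 \le n/m$. Hence, $[1,2]\mathcal{I} \subseteq [1, \lfloor n/m \rfloor]\mathcal{I}$. By
Theorem 12, $[1, \lfloor n/m \rfloor]\mathcal{I} = [m,n]\mathcal{I}$. So by Theorem 4,
$\mathcal{I} = [1,1]\mathcal{I} \subset [1,2]\mathcal{I} \subseteq [1, \lfloor n/m \rfloor]\mathcal{I} = [m,n]\mathcal{I}$. ∎


COROLLARY 14. For all $c \in N^+$, $m \le n \in N^+$, and $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[m,n]\mathcal{I} = [cm, cn]\mathcal{I}$.


Proof. Immediate, since $\lfloor n/m \rfloor = \lfloor cn/cm \rfloor$.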


As further corollaries of Theorem 12, we have two results from (Osher- 
son, 1986b), the second of which originally appeared in (Blum, 1975). 


COROLLARY 15. $[2,3]EX \subseteq EX$.


COROLLARY 16. $[1,2]EX \not\subseteq EX$.
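Theorem 12 and its corollaries reduce every question about an "m out of n" deterministic team to integer division. A brief sketch (the function names are illustrative, not from the paper):

```python
def equivalent_team_size(m, n):
    """Size k of the 1-out-of-k team with the same power as [m, n] (Theorem 12)."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    return n // m


def can_aggregate(m, n):
    """True iff an m-out-of-n team collapses to a single machine (Corollary 13)."""
    return equivalent_team_size(m, n) == 1


print(can_aggregate(2, 3))  # True, matching Corollary 15
print(can_aggregate(1, 2))  # False, matching Corollary 16
print(equivalent_team_size(2, 7) == equivalent_team_size(6, 21))  # True (Corollary 14, c = 3)
```

The test `n // m == 1` is equivalent to $m/n > 1/2$, the majority condition of Corollary 13.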


IV. PLURALITY VERSUS PROBABILITY 


The values $1/2, 1/3, \ldots, 1/n, \ldots$ play a special role in probabilistic, team, and
aggregate inference. In a sense, these values may be viewed as possible
measures of uncertainty of a given probabilistic inference machine, or of a
team of deterministic inference machines. The same phenomenon shows up
in probabilistic limiting calculations (Freivalds, 1975). A compelling conjecture,
based on Theorem 8, other results in (Pitt, 1984) and the results of
the last section, is that these are the only possible measures of uncertainty
for any reasonable model of inference. Thus we expect that a team of
probabilistic inference machines is no different than some team of deterministic
machines, or some single probabilistic machine. From
Proposition 11, we know that $[m,n]\mathcal{I}\langle p\rangle \subseteq \mathcal{I}\langle (m/n) \cdot p\rangle$, and in light of
our conjecture it seems reasonable that containment would hold in the
other direction; thus for every $m \le n \in N^+$ and $0 < p \le 1$ we would have
that $[m,n]\mathcal{I}\langle p\rangle = \mathcal{I}\langle (m/n) \cdot p\rangle$. Our results in this section demonstrate
that while our conjecture that the measures of uncertainty are exactly the
values $1/2, 1/3, \ldots, 1/n, \ldots$ remains intact (i.e., every class $[m,n]\mathcal{I}\langle p\rangle$ is
equivalent to some class $[1,n']\mathcal{I}$, or equivalently, some class $\mathcal{I}\langle 1/n'\rangle$),
the expectation that $[m,n]\mathcal{I}\langle p\rangle = \mathcal{I}\langle (m/n) \cdot p\rangle$ fails, and a more subtle
relationship exists between teams of probabilistic machines and deterministic
teams or single probabilistic machines.

The following lemma states that in some special cases, however,
$[m,n]\mathcal{I}\langle p\rangle$ is in fact equal to $\mathcal{I}\langle (m/n) \cdot p\rangle$, in particular whenever $p$ is
itself some measure of uncertainty $1/k$ (and hence $p = IN(p)$).


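LEMMA 17. For all $m \le n \in N^+$, for all $0 < p \le 1$, for all
$\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$,
$[m,n]\mathcal{I}\langle p\rangle = [m,n]\mathcal{I}\langle IN(p)\rangle = \mathcal{I}\langle (m/n) \cdot IN(p)\rangle$.


Proof. The first equality is easily shown: Since $IN(p) \ge p$, we
immediately have $[m,n]\mathcal{I}\langle p\rangle \supseteq [m,n]\mathcal{I}\langle IN(p)\rangle$. To see that
$[m,n]\mathcal{I}\langle p\rangle \subseteq [m,n]\mathcal{I}\langle IN(p)\rangle$, let $S \in [m,n]\mathcal{I}\langle p\rangle$ be witnessed by
the team $M_1, M_2, \ldots, M_n$. For each $i$ ($1 \le i \le n$), $\mathcal{I}\langle p\rangle(M_i) \in \mathcal{I}\langle p\rangle$, by
definition. By Proposition 10, $\mathcal{I}\langle p\rangle = \mathcal{I}\langle IN(p)\rangle$, so for each $i$ we have
$\mathcal{I}\langle p\rangle(M_i) \in \mathcal{I}\langle IN(p)\rangle$. Let $M'_i$ witness that $\mathcal{I}\langle p\rangle(M_i) \in \mathcal{I}\langle IN(p)\rangle$,
i.e., $\mathcal{I}\langle p\rangle(M_i) \subseteq \mathcal{I}\langle IN(p)\rangle(M'_i)$. Then if $f \in S$, there exist
$1 \le i_1 < i_2 < \cdots < i_m \le n$ such that for each $1 \le j \le m$,
$f \in \mathcal{I}\langle p\rangle(M_{i_j}) \subseteq \mathcal{I}\langle IN(p)\rangle(M'_{i_j})$.
Thus $M'_1, M'_2, \ldots, M'_n$ witness that $S \in [m,n]\mathcal{I}\langle IN(p)\rangle$.

To prove the second equality note that
$[m,n]\mathcal{I}\langle IN(p)\rangle \subseteq \mathcal{I}\langle (m/n) \cdot IN(p)\rangle$ immediately from Proposition 11. We show that
$[m,n]\mathcal{I}\langle IN(p)\rangle \supseteq \mathcal{I}\langle (m/n) \cdot IN(p)\rangle$.

Let $IN(p) = 1/k$. Then we will show $[m,n]\mathcal{I}\langle 1/k\rangle \supseteq \mathcal{I}\langle m/nk\rangle$. Let
$t$ be the unique positive integer such that $1/(t+1) < m/nk \le 1/t$.
Then $\mathcal{I}\langle m/nk\rangle = [1,t]\mathcal{I}$ by Theorem 8. Let $S \in \mathcal{I}\langle m/nk\rangle$ and let
$M_1, M_2, \ldots, M_t$ witness that $S \in [1,t]\mathcal{I}$. We have two cases: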


Case 1. $k \ge t$. Let $M$ be the probabilistic amalgamation of
$M_1, M_2, \ldots, M_t$. Then for all $f \in S$, the probability that $M$ is correct on
input $f$ is at least $1/t \ge 1/k$. By making $n$ copies of $M$ we have a team of $n$ machines,
all of which identify $S$ with probability at least $1/k$. Thus
$S \in [n,n]\mathcal{I}\langle 1/k\rangle \subseteq [m,n]\mathcal{I}\langle 1/k\rangle$.


Case 2. $k < t$. Consider the sequence of IIMs $N_1, N_2, \ldots, N_{kn}$ where
$N_i = M_{i \bmod t}$ (reading $M_0$ as $M_t$). For each $j$ ($0 \le j \le n-1$), let the
probabilistic machine $P_j$ be the probabilistic amalgamation of
$N_{kj+1}, N_{kj+2}, \ldots, N_{kj+k}$. We show that
$P_0, P_1, \ldots, P_{n-1}$ witness that $S \in [m,n]\mathcal{I}\langle 1/k\rangle$. We must show that for
any $f \in S$, at least $m$ of the machines $P_0, P_1, \ldots, P_{n-1}$ identify $f$ with
probability at least $1/k$. If $f \in S$ is given, then there is some $x$ ($1 \le x \le t$)
such that $f \in \mathcal{I}(M_x)$. Consequently, for each $0 \le j \le n-1$, if $M_x$ is one of
the $k$ machines from which $P_j$ is amalgamated (i.e.,
$M_x \in \{N_{kj+1}, N_{kj+2}, \ldots, N_{kj+k}\}$) then $P_j$ identifies $f$ with probability at least $1/k$.
Call such a $P_j$ a good amalgamation. We show that there are at least $m$
good amalgamations. Since $t > k$, for each $j$, at most one of
$\{kj+1, kj+2, \ldots, kj+k\}$ can be congruent to $x$ mod $t$. Thus each occurrence of $M_x$
in the sequence $N_1, N_2, \ldots, N_{kn}$ occurs in a different probabilistic
amalgamation. Therefore, the number of good amalgamations is at least
the number of occurrences of $M_x$. But $M_x$ occurs at least $\lfloor kn/t \rfloor$ times in
the sequence $N_1, N_2, \ldots, N_{kn}$, and since $m/nk \le 1/t$, we have that the
number of good amalgamations is at least $\lfloor kn/t \rfloor \ge m$. Thus
$S \in [m,n]\mathcal{I}\langle 1/k\rangle$. ∎
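The counting argument of Case 2 can be replayed on concrete numbers. The sketch below assumes the indexing $N_i = M_{i \bmod t}$ with residue 0 standing for $M_t$, and counts how many blocks $P_j$ contain a copy of a given $M_x$; the function names are hypothetical.

```python
def good_amalgamations(n, k, t, x):
    """Number of blocks P_0 .. P_{n-1} that contain a copy of M_x (1 <= x <= t)."""
    good = 0
    for j in range(n):
        block = range(k * j + 1, k * j + k + 1)  # indices kj+1 .. kj+k
        if any(i % t == x % t for i in block):   # N_i is M_x iff i = x (mod t)
            good += 1
    return good


def case2_holds(m, n, k):
    """Check that every M_x lands in at least m blocks whenever Case 2 applies."""
    t = (n * k) // m  # the unique t with 1/(t+1) < m/(nk) <= 1/t
    if not k < t:
        return True   # Case 1 applies instead; nothing to check here
    return all(good_amalgamations(n, k, t, x) >= m for x in range(1, t + 1))


# m = 3, n = 5, k = 2 gives t = 3 and a sequence N_1 .. N_10.
print(good_amalgamations(n=5, k=2, t=3, x=1))  # 4 good blocks, and 4 >= m = 3
print(case2_holds(3, 5, 2))                    # True
```

With these numbers $M_1$ sits at positions 1, 4, 7, and 10, which fall into four different blocks of length 2.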


The following theorem completely characterizes the relationship
between any probabilistic class $\mathcal{I}\langle p\rangle$ and a “team-probabilistic” class
$[m,n]\mathcal{I}\langle x/y\rangle$.


THEOREM 18. For all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$, for all positive
integers $m, n, x, y$ and real $p$ between 0 and 1, if $m \le n$ and $x \le y$ then:


(a) $\mathcal{I}\langle p\rangle \subseteq [m,n]\mathcal{I}\langle x/y\rangle \iff IN(p) \ge IN((m/n) \cdot IN(x/y))$.
(b) $\mathcal{I}\langle p\rangle \supseteq [m,n]\mathcal{I}\langle x/y\rangle \iff IN(p) \le IN((m/n) \cdot IN(x/y))$.


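Proof. Suppose all the relevant variables satisfy the hypothesis. By
Lemma 17,

$$[m,n]\mathcal{I}\langle x/y\rangle = \mathcal{I}\langle (m/n) \cdot IN(x/y)\rangle.$$

Thus, by Proposition 10,

$$\mathcal{I}\langle p\rangle \subseteq [m,n]\mathcal{I}\langle x/y\rangle = \mathcal{I}\langle (m/n) \cdot IN(x/y)\rangle \text{ iff } IN(p) \ge IN((m/n) \cdot IN(x/y)).$$

Similarly,

$$\mathcal{I}\langle p\rangle \supseteq [m,n]\mathcal{I}\langle x/y\rangle = \mathcal{I}\langle (m/n) \cdot IN(x/y)\rangle \text{ iff } IN(p) \le IN((m/n) \cdot IN(x/y)).$$ ∎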
n y n y 


Note that for any class $[m,n]\mathcal{I}\langle x/y\rangle$, there is a value $k$ such that
$1/k = IN((m/n) \cdot IN(x/y))$, so by Theorem 18,
$[m,n]\mathcal{I}\langle x/y\rangle = \mathcal{I}\langle 1/k\rangle = [1,k]\mathcal{I}$.
Thus every class of the form $[m,n]\mathcal{I}\langle x/y\rangle$ is equivalent to some
team or probabilistic class.

Theorem 18 gives the means for comparing any two different combinations
of pluralism and probability. Returning to the example in Section I,
we can determine which has more power: a team of 5 such that at
least 3 are correct with probability $1/3$, or a team of 7 such that at least 2 are
correct with probability $7/10$. By multiplying probabilities we would expect
that they would be equivalent, since both teams are correct with “intuitive
probability” $1/5$. But Theorem 18 tells us that $\mathcal{I}\langle 1/5\rangle = [3,5]\mathcal{I}\langle 1/3\rangle$, since
$IN(1/5) = 1/5 = IN(3/5 \cdot 1/3) = IN(3/5 \cdot IN(1/3))$. However,
$IN(2/7 \cdot IN(7/10)) = 1/3$, hence
$\mathcal{I}\langle 1/5\rangle = [3,5]\mathcal{I}\langle 1/3\rangle \ne [2,7]\mathcal{I}\langle 7/10\rangle$.
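The comparison can be carried out numerically. The sketch below assumes that the two teams of the example succeed with probabilities $1/3$ (3 out of 5) and $7/10$ (2 out of 7), a reading under which both products equal $1/5$; the helper names `interval` and `equivalent_team_size` are illustrative only.

```python
from fractions import Fraction
import math


def interval(p):
    """IN(p) = 1 / floor(1/p) for 0 < p <= 1."""
    return Fraction(1, math.floor(1 / Fraction(p)))


def equivalent_team_size(m, n, p):
    """k with [m,n] I<p> = I<1/k> = [1,k] I, following Lemma 17 and Corollary 9."""
    return int(1 / interval(Fraction(m, n) * interval(p)))


# Assumed example values: same "intuitive probability" 1/5 for both teams.
print(Fraction(3, 5) * Fraction(1, 3) == Fraction(2, 7) * Fraction(7, 10))  # True

print(equivalent_team_size(3, 5, Fraction(1, 3)))   # 5
print(equivalent_team_size(2, 7, Fraction(7, 10)))  # 3
```

Under these assumed values the first team matches a 1-out-of-5 team while the second matches only a 1-out-of-3 team, because $IN(7/10) = 1$ rounds the per-machine probability up before the factor $2/7$ is applied.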

This method gives us the power to completely characterize the
relationship between $[m,n]\mathcal{I}\langle x/y\rangle$ and $[m',n']\mathcal{I}\langle x'/y'\rangle$:


COROLLARY 19. For all $\mathcal{I} \in \{EX^0, EX^1, \ldots, EX^*, BC\}$, for all
$m, n, x, y, m', n', x', y' \in N^+$, if $m \le n$, $x \le y$, $m' \le n'$, and $x' \le y'$, then

$$[m,n]\mathcal{I}\langle x/y\rangle \subseteq [m',n']\mathcal{I}\langle x'/y'\rangle \text{ iff } IN((m/n) \cdot IN(x/y)) \ge IN((m'/n') \cdot IN(x'/y')).$$


Proof. Suppose all the relevant variables satisfy the hypothesis. By
Lemma 17,

$$[m,n]\mathcal{I}\langle x/y\rangle = \mathcal{I}\langle (m/n) \cdot IN(x/y)\rangle.$$

Hence, the result follows from clause (a) of Theorem 18. ∎


V. CRITERIA TRADE-OFFS


The results of the previous sections all concerned pluralism and
probability trade-offs for a fixed criterion of successful inference.
Corollary 19 can be further generalized to characterize trade-offs
when the inference criterion also varies. The results of this section
generalize (and summarize) many of the results in (Pitt, 1984; Smith,
1982).


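THEOREM 20. For all $a, a' \in N$, for all $m, m', n, n' \in N^+$ with $m \le n$ and
$m' \le n'$, for all positive $p, p' \le 1$,

$$[m,n]EX^a\langle p\rangle \subseteq [m',n']EX^{a'}\langle p'\rangle$$

iff

$$IN\left(\frac{m}{n} \cdot IN(p)\right) \ge IN\left(\frac{m'}{n'} \cdot IN(p')\right) \cdot \left(1 + \left\lfloor \frac{a}{a'+1} \right\rfloor\right).$$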


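Proof. Suppose $a$, $a'$, $m$, $m'$, $n$, $n'$, $p$, and $p'$ satisfy the hypothesis. By
applying Lemma 17 and then Corollary 9 we have

$$[m,n]EX^a\langle p\rangle = EX^a\langle (m/n) \cdot IN(p)\rangle = [1, 1/IN((m/n) \cdot IN(p))]EX^a.$$

Similarly,

$$[m',n']EX^{a'}\langle p'\rangle = [1, 1/IN((m'/n') \cdot IN(p'))]EX^{a'}.$$

Hence by Theorem 5,

$$[m,n]EX^a\langle p\rangle \subseteq [m',n']EX^{a'}\langle p'\rangle \iff \frac{1}{IN((m'/n') \cdot IN(p'))} \ge \frac{1}{IN((m/n) \cdot IN(p))} \cdot \left(1 + \left\lfloor \frac{a}{a'+1} \right\rfloor\right).$$

The theorem follows. ∎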
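The trade-off can be turned into a small decision procedure. The sketch below assumes the condition is read as $IN((m/n) \cdot IN(p)) \ge IN((m'/n') \cdot IN(p')) \cdot (1 + \lfloor a/(a'+1) \rfloor)$ for finite $a$ and $a'$, which is what Theorem 5 yields after both teams are reduced to 1-out-of-k teams; the function names and argument order are hypothetical.

```python
from fractions import Fraction
import math


def interval(p):
    """IN(p) = 1 / floor(1/p) for 0 < p <= 1."""
    return Fraction(1, math.floor(1 / Fraction(p)))


def ex_team_contained(m, n, p, a, m2, n2, p2, a2):
    """Decide [m,n]EX^a<p> <= [m2,n2]EX^a2<p2> for finite anomaly bounds a, a2."""
    lhs = interval(Fraction(m, n) * interval(p))
    rhs = interval(Fraction(m2, n2) * interval(p2)) * (1 + a // (a2 + 1))
    return lhs >= rhs


# One machine allowed 1 anomaly versus a 1-out-of-2 team allowed none.
print(ex_team_contained(1, 1, 1, 1, 1, 2, 1, 0))  # True
# The same machine versus a single machine allowed no anomalies.
print(ex_team_contained(1, 1, 1, 1, 1, 1, 1, 0))  # False
```

Setting $m = m' = p = p' = 1$ recovers the machine-count inequality $n' \ge n \cdot (1 + \lfloor a/(a'+1) \rfloor)$ of Theorem 5.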


The above result does not address the cases when either $a$ or $a'$ has
value $*$. This is due solely to the notational complexity of stating such a
theorem. Using the other parameters from the above theorem, and the
same techniques, it is easy to see that $[m,n]EX^*\langle p\rangle \subseteq [m',n']EX^{a'}\langle p'\rangle$
iff $a' = *$ and $IN((m/n) \cdot IN(p)) \ge IN((m'/n') \cdot IN(p'))$. Furthermore, for
any $a \ne *$, $[m,n]EX^a\langle p\rangle \subseteq [m',n']EX^*\langle p'\rangle$ iff
$IN((m/n) \cdot IN(p)) \ge IN((m'/n') \cdot IN(p'))$. The trade-offs with BC type inference are slightly more
complicated to express. This is because the class BC contains sets that
cannot be EX inferred by any probabilistic team.


THEOREM 21. For all $m \le n \in N^+$, for all $a \in N \cup \{*\}$, for all real
positive $p \le 1$, $BC \not\subseteq [m,n]EX^a\langle p\rangle$.


Proof. Suppose $a$, $m$, $n$, and $p$ satisfy the hypothesis. Let
$x = 1/IN((m/n) \cdot p)$. By Proposition 11 and Corollary 9,
$[m,n]EX^a\langle p\rangle \subseteq EX^a\langle (m/n) \cdot p\rangle = [1,x]EX^a$. By definition,
$[1,x]EX^a \subseteq [1,x]EX^*$. The theorem follows from Theorem 7. ∎


COROLLARY 22. For all $m \le n \in N^+$, for all $m' \le n' \in N^+$, for all
$a \in N \cup \{*\}$, for all real positive $p, p' \le 1$,
$[m,n]BC\langle p\rangle \not\subseteq [m',n']EX^a\langle p'\rangle$.


Proof. Suppose all the relevant variables satisfy the hypothesis. By
Proposition 3 and definitions

$$BC = [1,1]BC = [n,n]BC \subseteq [m,n]BC \subseteq [m,n]BC\langle p\rangle.$$

The result follows from Theorem 21. ∎


Our final result characterizes precisely when a probabilistic EX team can 
be simulated by a probabilistic BC team. 


THEOREM 23. For all $m \le n \in N^+$, for all $m' \le n' \in N^+$, for all
$a \in N \cup \{*\}$, for all real positive $p, p' \le 1$,
$[m,n]EX^a\langle p\rangle \subseteq [m',n']BC\langle p'\rangle$ iff
$IN((m/n) \cdot IN(p)) \ge IN((m'/n') \cdot IN(p'))$.


Proof. By Lemma 17 and Corollary 9

$$[m,n]EX^a\langle p\rangle = EX^a\langle (m/n) \cdot IN(p)\rangle = [1, 1/IN((m/n) \cdot IN(p))]EX^a.$$

Similarly

$$[m',n']BC\langle p'\rangle = [1, 1/IN((m'/n') \cdot IN(p'))]BC.$$

By Theorem 6,

$$[m,n]EX^a\langle p\rangle \subseteq [m',n']BC\langle p'\rangle \iff 1/IN((m'/n') \cdot IN(p')) \ge 1/IN((m/n) \cdot IN(p)).$$

The theorem follows. ∎


VI. CONCLUSIONS 


A new notion of inference by a team of deterministic machines
($[m,n]\mathcal{I}$) was introduced for a variety of inference criteria $\mathcal{I}$. This new
type of inference was related to pluralism, a previous definition of team
inference ($[1,n]\mathcal{I}$). The relationships exposed answer questions raised in
(Osherson, 1986b) about aggregations of inference machines. By considering
teams of probabilistic inference machines, a subtle difference between
pluralistic and probabilistic inference was discovered, in that one can trade
pluralism for probabilism (or vice versa) in general only when one trades
all of the pluralism for probabilism (or vice versa).

The same probability versus plurality trade-off formula was found for 
comparing inference via an EX type criterion with inference via the BC 
criterion (Theorem 23) as was found in Corollary 19 for a fixed inference 
criterion. This would indicate that plurality and probability are factors in 
the inference process that completely obscure the effects of tolerating 
finitely many errors in the inferred program. Plurality by itself is enough 
to hide the effects of error tolerance (Theorems 5 and 6). However, 
Corollary 19 indicates an interaction, modulated by the notion of interval, 
between pluralism and probabilism. 

The effects of tolerating anomalies are evident when comparing one
probabilistic EX team with another. The trade-off formula of Theorem 20
turned out to be a straightforward combination of the trade-off formulae of
Theorem 5 and Corollary 19. It is easy to verify that Theorem 20 is a direct
generalization of Theorem 5 (set $m = m' = p = p' = 1$) and also of
Corollary 19 (set $a = a'$, $p = x/y$, and $p' = x'/y'$). Consequently,
Theorems 20 and 23 comprise a summary and a generalization of much of
the material in (Pitt, 1984; Smith, 1982). We have left open the problem of
finding an analog of Theorem 20 for the BC anomaly hierarchy. If
Theorem 8 could be extended to include the BC anomaly hierarchy then
the results and techniques of this paper coupled with the work of Daley
(1983) would yield a solution. We have also left open the problem of
aggregating collections of probabilistic IIMs where each IIM is correct
with a different probability. An analogous open problem is that of characterizing
the power of teams of IIMs where each member of the team is
judged according to a different inference criterion.